Method and apparatus for encoding and decoding digital audio signals and apparatus for recording
digital audio



An encoding method and apparatus for encoding
multi-channel signals employed in, for example, a
stereo system of a video disc player, a video tape
recorder, a motion picture film system, or a
so-called multi-surround acoustic system. Five channels,
namely the center (C) channel, left (L) channel,
right (R) channel, left surround (SL) channel and the
right surround (SR) channel, for example, are handled
in common depending upon frequency characteristics
of digital audio signals and the targeted
playback environment, and encoding (105) is done
while the combinations of the channels to be handled
in common are altered. High compression may
be achieved with the use of pre-existing encoding
and decoding units by handling the channels in
common without dependency upon the degree of
correlation of multi-channel digital data.
BACKGROUND OF THE INVENTION

This invention relates to an encoding method 
and apparatus for encoding multi-channel signals 
employed in, for example, a stereo system of a 
video disc player, a video tape recorder, a motion
picture film system, or a so-called multi-surround
acoustic system. The invention also relates
to a corresponding decoding method and
apparatus, and a recording medium. 

There are a variety of techniques of high effi- 
ciency encoding of audio signals or speech signals. 
An example of these techniques is transform cod- 
ing in which a frame of digital signals representing 
the audio signal on the time axis is converted by 
an orthogonal transform into a block of spectral 
coefficients representing the audio signal on the 
frequency axis. 

There is also known a sub-band coding in 
which the frequency band of the audio signal is 
divided by a filter bank into a plurality of sub-bands 
without forming the signal into frames along the 
time axis prior to coding. In addition, there is 
known a combination of sub-band coding and 
transform coding, in which digital signals represent- 
ing the audio signal are divided into a plurality of 
frequency ranges by sub-band coding, and trans- 
form coding is applied to each of the frequency 
ranges. 

Filters for dividing a frequency
spectrum into a plurality of equal-width frequency
ranges include the quadrature mirror filter (QMF), as
discussed in R. E. Crochiere, "Digital Coding of
Speech in Sub-bands", 55 Bell Syst. Tech. J. No. 8
(1976). With such a QMF, the frequency spectrum
of the signal is divided into two equal-width
bands. With the QMF, aliasing is not produced
when the frequency bands resulting from the division
are subsequently combined together. In
"Polyphase Quadrature Filters - A New Subband
Coding Technique", Joseph H. Rothweiler, ICASSP
83, Boston, there is shown a technique of dividing
the frequency spectrum of the signal into equal-width
frequency bands. With this polyphase
QMF, the frequency spectrum of the signal can be
divided at one time into plural equal-width frequency
bands.

There is also known a technique of orthogonal
transform including dividing the digital input audio
signal into frames of a predetermined time duration,
and processing the resulting frames using a
discrete Fourier transform (DFT), discrete cosine
transform (DCT) or modified DCT (MDCT) for
converting the signal from the time axis to the
frequency axis. Discussions on MDCT may be
found in J.P. Princen and A.B. Bradley, "Subband
Transform Coding Using Filter Bank Based on
Time Domain Aliasing Cancellation", ICASSP 1987.
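
By way of illustration only, the time domain aliasing cancellation
property of the MDCT may be sketched as follows. The sketch assumes the
textbook MDCT definition, a sine window and 50 % overlapped frames; the
function names and the frame size are hypothetical and are not taken
from the cited reference or from any particular codec.

```python
import math

def sine_window(two_n):
    """Sine window; satisfies w[n]^2 + w[n+N]^2 = 1 (Princen-Bradley condition)."""
    return [math.sin(math.pi * (n + 0.5) / two_n) for n in range(two_n)]

def mdct(frame):
    """Map 2N windowed time samples to N spectral coefficients."""
    n_half = len(frame) // 2
    return [
        sum(frame[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]

def imdct(coeffs):
    """Map N coefficients back to 2N samples that still contain time aliasing."""
    n_half = len(coeffs)
    return [
        (2.0 / n_half) * sum(
            coeffs[k] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for k in range(n_half))
        for n in range(2 * n_half)
    ]

def analyze_synthesize(signal, n_half):
    """Windowed MDCT/IMDCT with overlap-add; len(signal) must be a multiple of n_half."""
    window = sine_window(2 * n_half)
    # Zero-pad by half a frame on each side so every input sample is covered twice.
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        frame = [padded[start + i] * window[i] for i in range(2 * n_half)]
        rec = imdct(mdct(frame))
        for i in range(2 * n_half):
            # The aliasing terms of neighbouring frames cancel in this sum.
            out[start + i] += rec[i] * window[i]
    return out[n_half:n_half + len(signal)]
```

Each frame of 2N samples yields only N coefficients, yet overlap-adding
the inverse transforms of neighbouring frames restores the input, so the
transform is critically sampled despite the overlap.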



By quantizing the signals divided on the band
basis by the filter or orthogonal transform, it becomes
possible to control the band subjected to
quantization noise, and psychoacoustically more efficient
coding may be performed by utilizing the
so-called masking effects. If the signal components
are normalized from band to band with the maximum
value of the absolute values of the signal
components, it becomes possible to effect more
efficient coding.
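
The band-by-band normalization mentioned above may be sketched as
follows. This is a minimal illustration assuming a uniform quantizer
applied after dividing each band by its peak magnitude; the function
names and the quantizer are hypothetical and do not represent any
particular encoding system.

```python
def normalize_band(coeffs):
    """Scale a band by its peak magnitude; return (scale_factor, normalized values)."""
    scale = max((abs(c) for c in coeffs), default=0.0)
    if scale == 0.0:
        return 0.0, [0.0 for _ in coeffs]
    return scale, [c / scale for c in coeffs]

def quantize(normalized, bits):
    """Uniform quantizer on [-1, 1] using a signed grid of `bits` bits."""
    levels = (1 << (bits - 1)) - 1
    return [round(v * levels) for v in normalized]

def dequantize(codes, bits, scale):
    """Undo quantize() and re-apply the band's scale factor."""
    levels = (1 << (bits - 1)) - 1
    return [scale * q / levels for q in codes]
```

Because the normalized values always span the full quantizer range, the
quantization error of a band is bounded by the scale factor divided by
twice the number of levels, whatever the absolute level of the band.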

In a technique of quantizing the spectral coefficients
resulting from an orthogonal transform, it is
known to use sub-bands that take advantage of the
psychoacoustic characteristics of the human auditory
system. That is, spectral coefficients representing
an audio signal on the frequency axis may
be divided into a plurality of critical frequency
bands. The width of the critical bands increases with
increasing frequency. Normally, about 25 critical
bands are used to cover the audio frequency spectrum
of 0 Hz to 20 kHz. In such a quantizing
system, bits are adaptively allocated among the
various critical bands. For example, when applying
adaptive bit allocation to the spectral coefficient
data resulting from MDCT, the spectral coefficient
data generated by the MDCT within each of the
critical bands is quantized using an adaptively allocated
number of bits.

There are presently known the following two bit
allocation techniques. For example, in IEEE Transactions
on Acoustics, Speech and Signal Processing,
vol. ASSP-25, No. 4, August 1977, bit allocation
is carried out on the basis of the amplitude of the
signal in each frequency band.

In the bit allocation technique described in
M.A. Krassner, "The Critical Band Encoder - Digital
Encoding of the Perceptual Requirements of the
Auditory System", ICASSP 1980, the
psychoacoustic masking mechanism is used to determine
a fixed bit allocation that produces the
necessary signal-to-noise ratio for each frequency
band.
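
As an illustration of allocation based on the signal amplitude in each
band, the following sketch distributes a bit budget greedily. It is not
the algorithm of either cited reference; the greedy rule, the figure of
about 6.02 dB of noise reduction per bit and all names are assumptions
made for this example.

```python
import math

def allocate_bits(band_energies, total_bits, max_bits_per_band=16):
    """Greedy allocation: repeatedly give one bit to the band whose estimated
    quantization noise (level in dB minus about 6.02 dB per bit) is highest."""
    bits = [0] * len(band_energies)
    level_db = [10.0 * math.log10(e) if e > 0 else float("-inf")
                for e in band_energies]
    for _ in range(total_bits):
        candidates = [i for i in range(len(bits))
                      if bits[i] < max_bits_per_band and level_db[i] > float("-inf")]
        if not candidates:
            break  # nothing left that can usefully take another bit
        # max() returns the first maximal element, so ties go to the lowest band.
        best = max(candidates, key=lambda i: level_db[i] - 6.02 * bits[i])
        bits[best] += 1
    return bits
```

A masking-based allocation would replace the level in dB by the level
relative to the masking threshold of the band, leaving the loop itself
unchanged.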

Among high-efficiency encoding systems for
audio signals making use of the above-mentioned
sub-band coding or the like, a system in which
audio data is compressed to about 1/5 by taking
advantage of the characteristics of the human
sense of hearing has been put into practice.
That is, there is a system called ATRAC (Adaptive
Transform Acoustic Coding, trade name of SONY
Corporation) used in e.g. MiniDisc (trade name of
SONY Corporation), a magneto-optical disc 64 mm
in diameter, as the efficient encoding system
compressing audio data so that its data quantity
becomes equal to about 1/5.

In a stereo or multi-surround audio system for
a motion picture film system, high
definition television, video tape recorder or a video
disc player, as well as in common audio equipment,
the tendency is towards handling audio or
speech signals of a plurality of, e.g., four to eight,
channels. It is desired in this case to reduce the bit
rate by way of high efficiency encoding.

Above all, when recording digital audio signals
of eight channels, namely left channel, left center
channel, center channel, right center channel, right
channel, left surround channel, right surround channel
and sub-woofer channel, on a motion picture
film, a necessity arises for high efficiency encoding
to reduce the bit rate. That is, an area sufficient
to record eight channels of 16-bit linear-quantized
audio data at a sampling frequency of 44.1 kHz is
difficult to secure on the motion picture film, thus
necessitating compression of the audio data.
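
The data quantity involved can be worked out directly from the figures
given above. The following short calculation is illustrative only; the
factor of 1/5 is the approximate single-channel compression ratio
mentioned earlier, and the bits needed for error correction are not
included.

```python
CHANNELS = 8
BITS_PER_SAMPLE = 16
SAMPLE_RATE_HZ = 44_100

def pcm_bit_rate(channels, bits_per_sample, sample_rate_hz):
    """Bit rate in bits per second of uncompressed linear PCM."""
    return channels * bits_per_sample * sample_rate_hz

raw = pcm_bit_rate(CHANNELS, BITS_PER_SAMPLE, SAMPLE_RATE_HZ)
compressed = raw / 5  # "about 1/5"; error correction overhead excluded

print(raw)         # 5644800 (about 5.6 Mbit/s)
print(compressed)  # 1128960.0 (about 1.1 Mbit/s)
```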

The channels of the eight channel data recorded
on the motion picture film are associated
with a left speaker, a left center speaker, a center
speaker, a right center speaker, a right speaker, a
surround left speaker, a surround right speaker,
and a sub-woofer speaker, which are disposed on
the screen side where a picture reproduced from
the picture recording area of the motion picture film
is projected by a projector. The center speaker is
disposed at the center on the screen side, and
serves to output reproduced sound by audio data
of the center channel. The center speaker outputs the
most important reproduced sound, such as speech
of the actor.

The sub-woofer speaker serves to output reproduced
sound by audio data of the sub-woofer channel.
The sub-woofer speaker effectively outputs
sound which is felt as vibration rather than heard as sound in
the low frequency range, such as the sound of an explosion,
and is frequently used to good effect in explosion scenes.
The left speaker and the right speaker are
disposed on the left and right sides of the screen, and
serve to output reproduced sound by audio data of
the left channel and reproduced sound by audio data
of the right channel, respectively. These left and right
speakers provide a stereo sound effect. The left center
speaker is disposed between the left speaker
and the center speaker, and the right center speaker
is disposed between the center speaker and the
right speaker. The left center speaker outputs reproduced
sound by audio data of the left center channel, and
the right center speaker outputs reproduced sound
by audio data of the right center channel. These left
and right center speakers perform auxiliary roles for
the left and right speakers, respectively. Above all,
in movie theaters having a large screen and a large
seating capacity, there is the
drawback that localization of the sound image becomes
unstable depending upon seat positions.
However, the above-mentioned left and right center
speakers are added to thereby create
more realistic localization of the sound
image.

In addition, the surround left and right speakers
are disposed so as to surround the spectators'
seats. These surround left and right speakers serve
to respectively output reproduced sound by audio
data of the surround left channel and reproduced
sound by audio data of the surround right channel, and
have the effect of providing reverberation or the impression
of being surrounded by hand clapping or shouts of joy.
Thus it is possible to create the sound image in a more
three-dimensional manner.

In addition, since defects are apt to take place
on the surface of a medium such as a motion picture
film, if digital data is recorded as it is, data loss
takes place frequently. Such a recording system
cannot be employed from a practical point of view.
For this reason, the capability of the error correcting code
is very important.

Accordingly, with respect to data compression,
it is necessary to carry out compression processing
to such a degree that recording can be made in
the recording area on the film, taking the bits for the
error correcting code into consideration.

In this consideration, as the method of compression
processing of digital audio data of eight
channels as described above, there is applied a
high efficiency encoding system, such as the ATRAC
system, which achieves high quality comparable
to CD by carrying out optimum bit allocation
taking the above-mentioned characteristics
of the human sense of hearing into account,
while compressing the 16-bit digital audio
data with the sampling frequency of 44.1 kHz
to about 1/5.

However, the high efficiency encoding system
of compressing the digital audio data to about 1/5
is an encoding system for a single channel. If this
system is employed for encoding multi-channel
audio data, it is not possible to achieve effective
data encoding exploiting data interdependency
among different channels or such elements as data
or format characteristics of the respective channels.

On the other hand, since the human sense of hearing
tends to be unstable in its sense of direction
with respect to sound in the high frequency
range, there is known a method of encoding
data in common among respective channels in
the high frequency range and recording the data
thus encoded in common for diminishing the recording
area. However, since the level difference
can be perceived even though the sense of direction
of the sound becomes indefinite, it occurs
frequently that changes in the sound field are perceived
by the hearer on multi-channel reproduction,
particularly if the correlation among different channels
is low.
SUMMARY OF THE INVENTION

In view of the foregoing, it is an object of the 
present invention to provide a signal encoding 
method and apparatus, a signal decoding method 
and apparatus and a recording medium, in which 
high compression may be achieved in multi-chan- 
nel signal encoding using pre-existing encoding 
and decoding units without dependency upon the 
correlation of the digital data among the respective 
channels. 

In one aspect, the present invention provides 
an encoding method for encoding digital signals of 
plural channels and outputting the encoded digital 
signals and the parameter information for encoding, 
including the steps of handling the digital signals of 
at least a part of the channels in common to form a 
common digital signal, altering the combinations of 
channels handled in common depending upon fre- 
quency characteristics of the digital signals or the 
targeted playback environment, and encoding the 
common digital signal. The present invention also 
provides an encoding apparatus for carrying out 
the encoding method. 

In another aspect, the present invention pro- 
vides a decoding apparatus for decoding encoded 
digital signals using parameters for encoding, 
which encoded digital signals are such signals in 
which part or all of digital signals of plural channels 
are handled as one or more common signals. The 
combinations of channels for common handling can 
be altered in dependence upon frequency char- 
acteristics of the digital signals and the targeted 
playback environment. The decoding apparatus in- 
cludes decoding means for decoding the common 
signals, distributing means for distributing the de- 
coded common signals in dependence upon the 
combinations of common handling, and decoding 
means for restoring the decoded common signals 
of plural channels. 

In still another aspect, the present invention 
provides a recording medium having recorded 
thereon such a signal in which part or all of digital 
signals of plural channels are handled as one or 
more common signals and encoded, the parameter 
information specifying the combinations of chan- 
nels to be handled in common, an encoded signal 
other than the common signals and the parameter 
information for encoding, in addition to the param- 
eter information concerning the encoding. The 
combinations of channels for common handling are 
altered in dependence upon frequency characteris- 
tics of the digital signals and the targeted playback 
environment. 

With the encoding method and apparatus of
the present invention, the digital signals of at least
a part of plural channels are handled as common
signals and encoded for raising the compression
ratio. The combinations of channels to be handled
in common or the processing method for handling
the signals in common are altered in dependence
upon the targeted or recommended playback environment
for suppressing changes in the sound field
otherwise caused by common handling if the digital
signals are audio signals.

It is possible with the encoding method and
apparatus of the present invention to avoid an unstable
sound field due to sudden changes in the
processing method of handling of common data or
in the combinations of channels to be handled in
common.

With the decoding apparatus of the present
invention, digital signals of plural channels are decoded
from at least one signal handled in common,
and the processing method for handling the common
signals is altered in dependence upon the
recommended playback environment for the encoded
signals for suppressing changes in the
sound field produced by common handling if the
digital signals are audio signals.

With the recording medium, such as an optical
disc or a motion picture film, of the present invention,
having recorded thereon the signals encoded
in accordance with the encoding method and apparatus
of the present invention, it becomes possible
to provide a stabilized sound field.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig.1 is a schematic block circuit diagram
showing a configuration of a multi-channel audio
data encoding apparatus embodying the present
invention.

Fig.2 is a block circuit diagram showing a
configuration of a multi-channel audio data encoding
apparatus not employing the present invention.

Fig.3 is a block circuit diagram showing a
configuration of a common handling analyzer of the
encoding apparatus embodying the present invention.

Fig.4 illustrates different sorts of selection of
channels for common handling embodying the
present invention.

Fig.5 illustrates changes between sound
frames of the channel for common handling.

Fig.6 is a schematic block circuit diagram of a
multi-channel audio signal decoding apparatus embodying
the present invention.

Fig.7 is a block circuit diagram showing a
configuration of a multi-channel audio decoding
apparatus not employing the present invention.

Fig.8 is a block circuit diagram showing a
modification of a multi-channel audio data decoding
apparatus of the present invention.

Fig.9 is a block circuit diagram showing an
illustrative configuration for implementing an encoding
method of parameters for common handling
embodying the present invention.

Fig.10 is a block circuit diagram showing another
illustrative configuration for implementing an
encoding method of parameters for common handling
embodying the present invention.

Fig.11 is a block circuit diagram showing an
illustrative configuration for implementing a decoding
method of parameters for common handling
embodying the present invention.

Fig.12 is a block circuit diagram showing another
illustrative configuration for implementing a
decoding method of parameters for common handling
embodying the present invention.

Fig.13 illustrates different sorts of selection of
channels for common handling for seven channels
embodying the present invention.

Fig.14 is a block circuit diagram showing an
illustrative configuration of an encoding unit of the
encoding apparatus embodying the present invention.

Fig.15 is a block circuit diagram showing an
illustrative configuration of a bit allocator of the
encoding unit.

Fig.16 is a graph illustrating a Bark spectrum
and the masking threshold level.

Fig.17 is a graph showing a signal level, a
minimum audibility curve and a masking threshold
level synthesized together.

Fig.18 is a block circuit diagram showing an
illustrative configuration of a decoding unit of a
decoding apparatus embodying the present invention.

Fig.19 illustrates recording positions of encoded
signals on a motion picture film.

Fig.20 illustrates header data of an encoded
bitstream for respective channels.

Fig.21 is a diagrammatic view showing a configuration
of an encoded bitstream.

Fig.22 is a block circuit diagram showing a
configuration of an analyzer for common handling
of another encoding apparatus of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring to the drawings, illustrative embodiments
of the present invention will be explained in
detail.

Fig.1 shows a configuration of an encoder (encoding
apparatus) to which the encoding method of
the present invention is applied. The encoder of the
present embodiment is configured to implement
multi-channel encoding using a plurality of single-channel
encoding units, such as encoding units of
the above-mentioned ATRAC
system.



That is, the encoder of the present embodiment
is configured to encode digital audio signals
of plural channels and to output the parameter
information for encoding along with encoded digital
audio signals. The encoder includes an analyzer for
common handling 102, a common data formulator
104 and encoding units 105f and 105g as means for
handling part or all of the digital audio signals of
plural channels as one or plural common signals,
modifying the channel combinations subjected to
common handling in dependence upon frequency
characteristics of the digital audio signals and the
targeted playback environment, and encoding the
common signals.

Fig.2 shows, for comparison with the encoder 
of the embodiment of the present invention, a con- 
figuration of a multi-channel encoder effecting 
channel-based compression encoding, that is a 
multi-channel encoder not employing the present 
invention. For facilitating the understanding, similar 
portions of Figs.1 and 2 are represented by the 
same reference numerals and the corresponding 
description is omitted. 

In the embodiment of Fig.1, explanation is
made using audio data of five channels, that is a
center (C) channel, a left (L) channel, a right (R)
channel, a left surround (SL) channel and a right
surround (SR) channel. A 5.1 channel configuration can be
constituted by adding a sub-woofer channel for the ultra-low
frequency range. The configuration of Fig.2 is explained
before explanation of Fig.1. Audio data of
the center (C) channel, left (L) channel, right (R)
channel, left surround (SL) channel and the right
surround (SR) channel, fed via input terminals 101a
to 101e, are routed to single-channel encoding
units 105a to 105e, respectively. These encoding
units 105a to 105e carry out encoding as later
explained. The resulting encoded data are fed to a
multiplexor 106 where the encoded data of the
respective channels are multiplexed into a single
bitstream which is outputted at an output terminal
107.

The bitstream from the output terminal 107 is
recorded on a recording medium 109, such as an
optical disc or a cinema film, by a recording unit
108 which effects processing such as appending
of error correction codes or modulation. Alternatively,
the bitstream is transmitted over a cable or by
radio transmission by a pre-set communication device.

On the other hand, with the encoder of the
embodiment of the present invention, audio data of
the center (C) channel, left (L) channel, right (R)
channel, left surround (SL) channel and the right
surround (SR) channel, supplied via the input terminals
101a to 101e, are entered to an analyzer for
common handling 102. The analyzer for common
handling 102 selects a technique for common handling
effective among the different channels, and a
range to be handled in common, and selectively
outputs only the portion of the audio data of the
respective channels that is to be handled in common.
If common handling is not performed, nothing
is outputted.

Outputs of the analyzer for common handling
102 are entered to common data extractors 103a to
103e of associated channels. The common data
extractors extract common portions from the original
audio data from channel to channel and transmit
only the remaining portions to the encoding units 105a
to 105e. The internal configuration of the encoding
units 105a to 105e is substantially equivalent to
that of the encoding units of Fig.2 and hence the
detailed description is omitted.

An output of the analyzer for common handling 
102 is also routed to a common data formulator 
104 which collects common data of the respective 
channels to form one or plural common data which 
is outputted. 

On the other hand, the encoding units 105a to
105e encode outputs of the common data extractors
103a to 103e, while the encoding units 105f
and 105g encode one or plural common data outputted
from the common data formulator 104.

These encoding units 105a to 105g output the
parameter information of common handling used
for encoding, along with respective encoded data,
to the multiplexor 106, which multiplexes outputs of
the encoding units 105a to 105g to form a bitstream
which is outputted at an output terminal
107.
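
The division of labour between the common data extractors 103a to 103e
and the common data formulator 104 may be pictured with the following
sketch. It is a simplified illustration under stated assumptions: each
channel is represented as a list of per-band coefficient lists, and the
common signal is formed by averaging the grouped channels, which is an
assumption of this example and is not stated in the description above.
All names are hypothetical.

```python
def split_common(channels, group, common_bands):
    """Split channel data into commonly handled data and per-channel remainders.

    channels:     dict mapping channel name -> list of per-band coefficient lists
    group:        names of the channels handled in common
    common_bands: indices of the bands handled in common
    Returns (common, residual, params).
    """
    common = {}
    for b in sorted(common_bands):
        width = len(channels[group[0]][b])
        # Assumption: the common signal is the plain average of the grouped channels.
        common[b] = [sum(channels[ch][b][i] for ch in group) / len(group)
                     for i in range(width)]
    residual = {}
    for name, bands in channels.items():
        # Bands moved to the common signal are removed (None) from grouped channels.
        residual[name] = [
            None if (name in group and b in common_bands) else list(coeffs)
            for b, coeffs in enumerate(bands)
        ]
    # Parameter information that a decoder would need to redistribute the common data.
    params = {"group": list(group), "bands": sorted(common_bands)}
    return common, residual, params
```

In this picture the per-channel remainders would go to the encoding
units 105a to 105e, the common data to the encoding units 105f and 105g,
and the parameter information to the multiplexor 106.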

The bitstream from the output terminal 107 is
recorded on a recording medium 109, such as an
optical disc or a motion picture film, by a recording
unit 108 which effects processing such as appending
of error correction codes or modulation. Alternatively,
the bitstream is transmitted over a cable
or by radio transmission by a pre-set communication
device.

Fig.3 shows an internal configuration of the 
analyzer for common handling 102. 

Referring to Fig. 3, the audio data of the center 
(C) channel, left (L) channel, right (R) channel, left 
surround (SL) channel and the right surround (SR) 
channel, supplied via the input terminals 121a to 
121e, are selectively transmitted to analyzers for 
common handling 122a to 122i, where it is ana- 
lyzed whether the use of the techniques for com- 
mon handling is effective, and the results of analy- 
ses are outputted. The analyzers for common han- 
dling 122a to 122i are associated with respective 
selections (combinations) of the channels to be 
handled in common. 

Outputs of the analyzers for common handling
122a to 122i are all sent to a selector for techniques
of handling in common 123. The selector
for techniques of handling in common 123 preferentially
selects a system which allows as many channels
as possible to be handled in common, that is an output of
that one of the analyzers for common handling 122a
to 122i which is disposed more towards the left in
Fig.3. Thus the selector determines the technique
for handling in common and the range of handling
in common in order to output the results of selection.
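
The preference of the selector 123 for the technique covering the most
channels may be sketched as follows. The tuple layout and the boolean
"effective" flag are hypothetical simplifications; the actual criterion
by which an analyzer judges common handling to be effective is not
modelled here.

```python
def select_technique(analysis_results):
    """Pick the effective result that handles the most channels in common.

    analysis_results: list of (name, channels, effective) tuples, ordered as
    the analyzers appear from left to right. Returns the chosen tuple, or
    None when no analyzer reports common handling as effective.
    """
    effective = [r for r in analysis_results if r[2]]
    if not effective:
        return None
    # max() keeps the first maximal entry, so ties go to the leftmost analyzer.
    return max(effective, key=lambda r: len(r[1]))
```

For instance, if handling all five channels in common is judged
ineffective while two three-channel groupings are judged effective, the
leftmost three-channel grouping is returned.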

The common data extractors 124 extract only
the portions of data of the respective channels from
the input terminals 121a to 121e that are to be handled
in common, based upon the data on common handling
obtained from the technique selector 123, and output
the extracted data at associated output terminals
125a to 125e. Data on
common handling, that is the parameter information
for handling in common, as supplied from the technique
selector 123, is outputted to the data extractors
for extracting commonly handled data 124a to 124e
on the sound frame basis, the sound frames serving as pre-set frames.

Taking an example of five channels, namely
the center (C) channel, left (L) channel, right (R)
channel, left surround (SL) channel and the right
surround (SR) channel, the sorts of selection of
channels to be handled in common will be explained
with reference to Fig.4.

Fig.4A indicates that all channels are to be
handled in common. This corresponds to the analyzer
122a which analyzes effectiveness of the
techniques for handling in common.

Fig.4B indicates that three channels, namely
the center (C) channel, left (L) channel and the left
surround (SL) channel, are to be handled in common
as the left-route channel, while three channels,
namely the center (C) channel, the right (R)
channel and the right surround (SR) channel, are
to be handled in common as the right-route channel.
These correspond to the analyzer for common
handling 122b and the analyzer for common handling
122c, respectively.

Fig.4C indicates that two channels, namely the
left (L) channel and the left surround (SL) channel,
are to be handled in common as the left-route
channel, while two channels, namely the right (R)
channel and the right surround (SR) channel, are
to be handled in common as the right-route channel.
These correspond to the analyzer for common
handling 122d and the analyzer for common handling
122e, respectively.

Fig.4D indicates that three channels, namely the
center (C) channel, left (L) channel and the right (R)
channel, are to be handled in common as the
forward-route channel, while two channels, namely
the left surround (SL) channel and the right surround
(SR) channel, are to be handled in common
as the backward-route channel. These correspond to
the analyzer for common handling 122f and the
analyzer for common handling 122g, respectively.

Fig.4E indicates that two channels, namely the
center (C) channel and the left (L) channel, are to
be handled in common as the left forward route
channel, while two channels, namely the center (C)
channel and the right (R) channel, are to be handled
in common as the right forward route channel. These
correspond to the analyzer for common handling
122h and the analyzer for common handling 122i,
respectively.

Thus the encoder of the embodiment illustrated
makes it possible to diminish the unnatural listening
impression caused by handling in common, not only by selecting the
channels to be handled in common so as to exploit
data characteristics but also by utilizing combinations
of common handling which take advantage of
features of the targeted playback environment. Although
the combinations of handling in common
shown in Fig.4 are the best of those thought to be
effective, any other combinations may be envisaged
within the scope of the present invention.

The techniques for handling in common, shown
in Fig.4, may also be changed between sound
frames as pre-set frames by the relation indicated
by arrows shown in Fig.5. That is, A to E in Fig.5
correspond to A to E in Fig.4, specifying that the
techniques connected by arrows may be changed from
one sound frame to another. If, for example, the
state devoid of a common channel (NONE) is selected
in a sound frame, the technique C, D or E
can be selected at the next sound frame. If the
technique D is selected, the technique A, E or
NONE becomes selectable at the next following
sound frame.

Conversely, the absence of an arrow from
NONE to A or B indicates that direct selection of A
or B at the next sound frame from the sound frame
NONE indicating the absence of the common channel
is inhibited. Sound frame localization may be
improved by affording a pre-set relation in selection
of the technique for common handling among
sound frames.
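
The frame-to-frame restriction may be expressed as a small transition
table. The sketch below encodes only the transitions actually stated in
the text (from NONE and from D); the arrows leaving A, B, C and E are
not described here and are therefore reported as unknown. Treating a
repeat of the same technique as permitted is an assumption of this
example.

```python
# Transitions stated in the text; other rows of Fig.5 are not known from it.
KNOWN_NEXT = {
    "NONE": {"C", "D", "E"},
    "D": {"A", "E", "NONE"},
}

def transition_allowed(current, candidate):
    """Return True or False when the text settles the question, else None."""
    if current == candidate:
        return True  # assumption: keeping the same technique is always permitted
    if current not in KNOWN_NEXT:
        return None  # the arrows leaving this technique are not given in the text
    return candidate in KNOWN_NEXT[current]
```

An encoder following this scheme would discard any candidate technique
for the next sound frame for which the function returns False, even if
that candidate covers more channels.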

As the technique for common handling, it is 
possible to cause plural techniques of common 
handling to exist within one sound frame. 

It may be contemplated to analyze audio signals
of respective channels from one specific frequency
band to another and to select the technique
of common handling from one specific frequency
band to another. That is, such an arrangement may be
used in which frequency converting means for converting
time-domain signals into frequency-domain
signals are provided in the inside of the analyzers
for common handling 122 or at an upstream side of
the input terminal 121 and the resulting frequency-domain
signals are analyzed from one specific
frequency band to another for selecting the technique
of common handling based upon the results
of analyses. This enables such a processing in
which common handling is realized by using the
technique of common handling of all channels for
the high frequency range as shown at A in Fig.4
and the technique of common handling of left and
right channels for the mid frequency range as
shown at C in Fig.4, thereby realizing more effective
common handling.
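
A band-dependent choice of technique of this kind may be sketched as
follows. The boundary frequencies are purely hypothetical placeholders,
and leaving the low range without common handling is likewise an
assumption of this example; the text specifies neither.

```python
def technique_for_band(center_hz, mid_start_hz=4000.0, high_start_hz=10000.0):
    """Choose a Fig.4 technique label from the centre frequency of a band.

    The default boundaries are placeholders chosen only for illustration.
    """
    if center_hz >= high_start_hz:
        return "A"     # all channels handled in common in the high range
    if center_hz >= mid_start_hz:
        return "C"     # left and right routes handled in common in the mid range
    return "NONE"      # assumption: no common handling in the low range

def plan_for_frame(band_centers_hz):
    """Technique label for every band of one sound frame."""
    return [technique_for_band(f) for f in band_centers_hz]
```

In a fuller sketch the fixed boundaries would be replaced by the results
of the band-by-band analysis carried out in the analyzers for common
handling 122.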

In addition, more effective common handling
may be achieved by common handling of e.g., the left
(L) channel and left surround (SL) channel and by
common handling of e.g., the right (R) channel and
right surround (SR) channel within the same specified
frequency range of the same sound frame.

In addition, assuming that there exist two channels
each having a double power in the center
channel (C), this channel may be divided and the
signals of the two channels may be recorded in
each of the left and left surround channels handled
in common and the right and right surround channels
handled in common for raising the efficiency
of common handling. Furthermore, the encoding/decoding
technique of separating the signal
component into tonal components and noise components
and encoding/decoding them, as proposed
by the present Assignee in our previous International
Application No. PCT/JP94/00880, date of international
application May 31, 1994, may be used
so that all channels are handled in common (A in
Fig.4) for the noise components and the left and
right channels are handled in common for the tonal
components.

Fig.6 shows a configuration of a decoder (decoding
apparatus) for carrying out the decoding method of the
present invention. The decoder of the embodiment
illustrated is such a decoder in which multi-channel
decoding is implemented using plural single-channel
decoding units, such as the decoding units
corresponding to the above-mentioned ATRAC system.

The decoder of the embodiment illustrated is such a
decoder in which part or all of the digital audio
signals of plural channels are handled as signals of
one or more channels handled in common, and in which
encoded digital audio signals of plural channels,
including signals for which the combinations of the
channels handled in common have been changed responsive
to the targeted playback environment and frequency
characteristics of the digital audio signals, are
decoded using the parameter information of common
handling used for encoding, as shown in Fig.6. The
decoder includes decoding units 133f and 133g, a
distributor of data handled in common 134 and
synthesizers of data handled in common 135a to 135e.
These decode the signals handled in common, distribute
the decoded signals handled in common among plural
channels responsive to the parameter information for
common handling, and synthesize the distributed signals
with the signals of respective channels decoded but not
handled in common.

Fig.7 shows, for comparison with the decoder 
of the present embodiment, a configuration of a 
multi-channel decoder which effects channel-based 
decoding, that is a multi-channel decoder not em- 
ploying the present invention. The portions cor- 
responding to those of Figs.1 and 2 are denoted by 
the same reference numerals and the description 
therefor is omitted for ease of understanding. 

Referring to Fig.7, before explanation of Fig.6,
the encoded bitstream, entered at an input terminal
131, is demultiplexed by a demultiplexer 132 into
encoded data of respective channels, which are
routed to decoding units 133a to 133e. The data is
decoded by the decoding units 133a to 133e, as later
explained, into decoded audio data which is outputted
at output terminals 136a to 136e.

On the other hand, a bitstream entering the 
input terminal 131 of the decoder of the embodi- 
ment of Fig.6 is supplied to the demultiplexor 132. 
Since the bitstream contains data specifying the 
channels handled in common (parameter informa- 
tion of handling in common) along with channel- 
based encoded data and data handled in common, 
the demultiplexor 132 divides the encoded data 
and the information on the parameters of handling 
in common from channel to channel and transmits 
resulting data to the decoding units 133a to 133g. 

The decoding units 133f and 133g, associated
with the channels of the data handled in common,
output decoded data handled in common and decoded
parameter information for common handling
to the distributor for data handled in common 134.
The distributor for data handled in common 134
formulates data for respective channels from one or
more common-handled data, with the aid of the
information on the parameters for handling in common,
and distributes the data to respective channels.

The synthesizers of common-handled data 135a to
135e synthesize outputs of the channel-based decoding
units 133a to 133e and an output of the distributor
of data handled in common 134 and output the
resulting data at output terminals 136a to 136e as
decoded data of respective channels.
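
The distribution and synthesis steps just described can be
sketched as follows. This is only an illustrative sketch, not
the circuit of Fig.6: the per-channel gain table standing in
for the parameter information of common handling, the sample
lists, and both function names are assumptions introduced
here.

```python
from typing import Dict, List

def distribute_common(common: List[float],
                      gains: Dict[str, float]) -> Dict[str, List[float]]:
    """Formulate per-channel data from one common-handled signal.

    `gains` is a hypothetical stand-in for the parameter information
    of common handling: it names the channels that share the common
    signal and the weight each one receives.
    """
    return {ch: [g * x for x in common] for ch, g in gains.items()}

def synthesize(channel_data: Dict[str, List[float]],
               distributed: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Add the distributed common data to the channel-based decoded data."""
    out: Dict[str, List[float]] = {}
    for ch, samples in channel_data.items():
        extra = distributed.get(ch)
        if extra is None:
            # Channel took no part in common handling: pass through.
            out[ch] = list(samples)
        else:
            out[ch] = [a + b for a, b in zip(samples, extra)]
    return out

# Usage: L and R share one common signal, C was coded on its own.
decoded = {"L": [1.0, 2.0], "R": [0.0, 0.0], "C": [0.5, 0.5]}
shared = distribute_common([2.0, 4.0], {"L": 0.5, "R": 0.5})
result = synthesize(decoded, shared)
```

A single distributor function can serve several common-handled
signals by being called once per signal and merging the results
before synthesis.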

Thus the decoder of the embodiment illustrated
formulates data of respective channels from one or
plural data handled in common, based upon the
information of the common-handling parameters,
using the distributor of data handled in common
134, and synthesizes the data with data of respective
channels not handled in common, using the
synthesizers of common-handled data 135a to
135e, for decoding digital signals of plural channels.
Decoding may be achieved by the sole distributor
of common-handled data 134 taking charge of
distribution even if there exist plural sorts of channels
made up of common-handled data or plural methods
of handling in common, or even if data of a
specific channel is divided into plural sorts of common
data which are then encoded in plural channels
handled in common.

Fig.8 shows a configuration of a decoder in
which common handling of multi-channel data does
not depart from the single-channel encoding system
employed in the embodiment illustrated and
data which are not handled in common in the
encoded state may be synthesized with data handled
in common.

Referring to Fig.8, the demultiplexor 132 divides
the bitstream entering the input terminal 131
into channel-based data not handled in common on
one hand and data handled in common and the
information on the parameters for common handling
on the other hand, and transmits the channel-based
data not handled in common to common-encoded
data synthesizing units 138a to 138e,
while transmitting data handled in common and the
information on the parameters for common handling
to a distributor for data handled in common
137.

The distributor for data handled in common
137 formulates channel-based data from one or
plural data handled in common, using the information
on the parameters for common handling, and
distributes the channel-based data to the respective
channels.

The common handled data, outputted by the
distributor for data handled in common 137 and
distributed to the respective channels, and the
channel-based data not handled in common, outputted
by the demultiplexor 132, are routed to
associated common-encoded data synthesizers
138a to 138e. The common-encoded data synthesizers
138a to 138e synthesize the data supplied
thereto and output them as encoded data.

The decoding units 133a to 133e of the next
stage decode outputs of the associated common-encoded
data synthesizers 138a to 138e. The outputs
of the decoding units 133a to 133e are issued as
channel-based data at associated output terminals
136a to 136e.

Since the distributor of data handled in common
134 and the synthesizers of data handled in
common 135a to 135e are provided upstream of
the decoding unit 133 in the decoder of the
embodiment illustrated, the decoder may be reduced
in size.

In addition, in the encoding method and apparatus
of the present invention, the following technique
taking advantage of the channel reproducing
environment may be utilized as a technique enabling
reproduction that produces less of an alien feeling on
the hearing sense.

First, the technique of altering the method of 
handling common data responsive to the recom- 
mended audio data reproducing environment is ex- 
plained. 

That is, if the five channels shown in Fig.4A are
to be handled in common, the signals of the center
(C) channel, left (L) channel, right (R) channel, left
surround (SL) channel and the right surround (SR)
channel are level-converted at a ratio of
C:L:R:SL:SR = 1.0000:0.7071:0.7071:0.5000:0.5000
and subsequently synthesized. For reproduction,
the data is reproduced from a sole channel or
distributed at the same ratio to all channels for
achieving effective common handling.

If the center (C) channel, left (L) channel and 
the left surround (SL) channel on one hand and the 
center (C) channel, right (R) channel and the right 
surround (SR) channel on the other hand, as shown 
in Fig.4B, are to be handled in common, the ratios 
of C:L:SL = 0.7071:1.0000:0.7071 and C:R:SR = 
0.7071:1.0000:0.7071 may be employed for effec- 
tive common handling. 

If the left (L) channel and the left surround (SL) 
channel on one hand and right (R) channel and the 
right surround (SR) channel on the other hand as 
shown in Fig.4C are to be handled in common, the 
ratios of L:SL = 1.0000:0.7071 and R:SR = 
1.0000:0.7071 may be employed for effective com- 
mon handling. 

If the left (L) channel, center (C) channel and
the right (R) channel on one hand and the
left surround (SL) channel and the right surround
(SR) channel on the other hand, as shown in
Fig.4D, are to be handled in common, the ratios of
C:L:R = 1.0000:0.7071:0.7071 and SL:SR =
0.7071:0.7071 may be employed for effective common
handling.

If the left (L) channel and the center (C) chan- 
nel on one hand and right (R) channel and the 
center (C) channel on the other hand as shown in 
Fig.4E are to be handled in common, the ratios of 
C:L = 0.7071:1.0000 and C:R = 0.7071:1.0000 
may be employed for effective common handling. 

The above ratios are optimum values as found 
by experiments conducted by the present inventors 
and may assume different values in future experi- 
ments. 
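
As an illustration of the level conversion and synthesis for
the five-channel case of Fig.4A, a minimal sketch is given
below. The ratio values are those stated above; the function
name, the dictionary layout and the per-sample list
representation are assumptions made only for this example.

```python
from typing import Dict, List

# Ratios stated for common handling of all five channels (Fig.4A).
RATIOS_5CH: Dict[str, float] = {
    "C": 1.0000, "L": 0.7071, "R": 0.7071, "SL": 0.5000, "SR": 0.5000,
}

def mix_common(frames: Dict[str, List[float]],
               ratios: Dict[str, float]) -> List[float]:
    """Level-convert each channel by its ratio and sum sample by sample."""
    length = len(next(iter(frames.values())))
    return [
        sum(ratios[ch] * frames[ch][i] for ch in ratios)
        for i in range(length)
    ]

# Usage: one sample of unit amplitude in every channel.
unit = {ch: [1.0] for ch in RATIOS_5CH}
common = mix_common(unit, RATIOS_5CH)
```

The other groupings (Fig.4B to Fig.4E) would use the same
routine with their own ratio tables, one table per group of
channels handled in common.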

The above-described encoding method and ap- 
paratus may also be modified so that data for 
restoration to the data previous to common han- 
dling will be contained in the code in addition to 
the information of common handling in order to 
enable reproduction not producing alien feeling to 
the hearing sense. 

Figs.9 and 10 show the configurations for
implementing the method of extracting parameters for
reproducing data used for common handling of
respective channels from data handled in common.
Fig.9 shows a configuration corresponding to the
portions responsible for common handling processing
in Fig.1, while Fig.10 shows a configuration in
which a common handling parameter extractor 141 is
added to the configuration of Fig.9.

Figs.11 and 12 show the configuration for
implementing the method of adjusting the data handled
in common using the parameter contained in
the code. Fig.11 shows a configuration corresponding
to the portion of Fig.6 distributing the common
handled data to the respective channels and Fig.12
shows a configuration in which a common handling
parameter adjustment unit 142 is added to Fig.11.

In Figs.9 to 12, the constituent elements
corresponding to those shown in Figs.1 and 6 are
denoted by the same reference numerals and the
detailed description is omitted for simplicity.

Referring first to Fig.9, data of respective channels
are supplied to input terminals 101 and 101a
to 101e. The data outputted by the common handling
analyzer 102 and entering a common-handled
data take-out unit 103 and the common handling
data formulator 104 is the data used for common
handling, with the number of such data being equal
to the number of channels of sound source data. If
the data of a given channel is not used for common
handling, the information specifying that such data
is not employed is transmitted. On the other hand,
data outputted by the common handling data
formulator 104 and sent to the multiplexor 106 is
the data handled in common, with the number of
the data being equal to the number of channels
handled in common. The number of the channels
handled in common is varied in dependence upon
the method of common handling.

On the other hand, data distributed by the
common handling data distributor 134 to respective
channels is sent to the common handling data
synthesizer 135 and synthesized with the data
encoded from channel to channel so as to be outputted
as channel-based decoded data.

In the configuration of Fig.9, the common handling
parameter extractor 141 may be provided to
each channel, as shown in Fig.10. To the common
handling parameter extractor 141 are entered data
and information used for channel-based common
handling and all data and information handled in
common. The common handling parameter extractor
141 analyzes the dependency of the channel
with respect to the data of the channels handled in
common and the technique of resetting the common
handling in order to find the scale parameters
of the channel under consideration for each
frequency band or a set of frequency bands used for
encoding as a unit and a scale parameter ratio
between respective frequency bands in the channel
under consideration. These values are sent to the
multiplexor 106 so as to be contained as the
codes.

In the decoder of Fig.11, the common handling
parameter adjustment unit 142 may be provided to
respective channels, as shown for the decoder of
Fig.12. To the common handling parameter adjustment
unit 142 are entered common handling data
for a channel under consideration, outputted from
the common handling data distributor 134, and the
common handling parameter information for the
channel under consideration, outputted by the
demultiplexor 132. By employing these data, the
common handling parameter adjustment unit 142 modifies
the common handling data by exploiting the
technique for canceling the common handling or the
dependency of the data of the channel under
consideration among the channels handled in common.
By this, a sound field closer to the original signal
data of the channel under consideration than is
achievable with the decoder of Fig.11 may be
reproduced.

Although this system may be employed as an
independent system irrelevant to the channel
reproducing environment, it is possible to formulate
more effective common handling parameters by
analyzing the common handling parameters by taking
advantage of data and the reproducing environment,
since the data dependency degree can be predicted
if the playback environment is specified.

For enabling reproduction by the encoding
method and apparatus of the embodiment illustrated
which does not evoke an alien feeling to the
hearing sense, there is an encoding method which
exploits the time change information between
sound frames in the selection of channels to be
handled in common or the method for common
handling processing.

The sound frame means an audio data processing
unit for encoding and decoding and is 512
samples for the sampling frequency of 44.1 kHz for
the ATRAC system employed in the present
embodiment.

Meanwhile, it is possible with the encoding
method and apparatus of the present embodiment
to alter the selection of channels to be handled in
common or the method for common handling processing
from one sound frame to another. If the
selection of the optimum channel or processing method
is done within each sound frame, it may occur
that channel or processing method selection is
varied from one sound frame to another to produce
an alien hearing feeling due to such variation.

Thus, in the channel selection, such alien hearing
feeling may be prevented from occurring by
monitoring the transition of selection from one
sound frame to another, so as to avoid continuation of
common handling and non-common handling
throughout the channels or to limit variation in the
selection under the stationary state in which there
is little change in input data.

In the selection of the method for common 
handling processing, frequent switching on the 
sound frame basis is not advisable since the dif- 
ference in sound quality due to difference in the 
processing method is large as compared to the 
case of channel selection and also since the en- 
coder and the decoder then need to be modified in 
dependence upon the processing method. Thus it 
is advisable to effect switching in terms of several 
sound frames at the minimum. 
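
One simple way of restricting switching to units of several
sound frames is to hold each selected processing method for a
minimum number of frames before a change is accepted. The
sketch below illustrates this idea only; the hold length of
four frames and the function name are assumptions, since the
text states no more than "several sound frames at the
minimum".

```python
from typing import List, Optional

def hold_method_selection(per_frame_choice: List[str],
                          min_hold: int = 4) -> List[str]:
    """Return the method actually used in each sound frame.

    A change requested by the per-frame analysis is accepted only
    after the current method has been in use for `min_hold` frames
    (`min_hold` is a hypothetical tuning value).
    """
    used: List[str] = []
    current: Optional[str] = None
    held = 0
    for choice in per_frame_choice:
        if current is None:
            current, held = choice, 1
        elif choice != current and held >= min_hold:
            current, held = choice, 1
        else:
            held += 1
        used.append(current)
    return used

# Usage: a one-frame excursion to "B" is suppressed, while a
# sustained request for "B" is followed once the hold has expired.
smoothed = hold_method_selection(["A", "B", "A", "A", "B", "B", "B"], min_hold=3)
```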

A modification of the present invention is 
hereinafter explained. 

In the previous embodiment, channel selection 
is made from among five channels, namely the 
center (C), left (L), right (R), left surround (SL) and 
right surround (SR) channels. Fig.13 shows selec- 
tion from among seven channels, namely the 
above five channels plus left center (CL) and right 
center (CR) channels. 

Fig.13A shows common handling of the totality 
of channels. 

Fig.13B shows common handling of four
channels, namely center (C), left center (CL), left
(L) and left surround (SL) channels, as left route
channels, and common handling of four channels,
namely center (C), right center (CR), right (R) and
right surround (SR) channels, as right route channels.

Fig.13C shows common handling of three 
channels, namely left center (CL), left (L) and left 
surround (SL) channels, as left route channels, and 
common handling of three channels, namely right 
center (CR), right (R) and right surround (SR) chan- 
nels, as right route channels. 

Fig.13D shows common handling of five channels,
namely center (C), left center (CL), left (L),
right center (CR) and right (R) channels, as forward
route channels, and common handling of two channels,
namely left surround (SL) and right surround (SR),
as backward route channels.

Fig.13E shows common handling of three
channels, namely center (C), left center (CL) and
left (L), as left forward route channels, and common
handling of three channels, namely center (C), right
center (CR) and right (R), as right forward route
channels.

Fig.13F shows common handling of two channels,
namely left center (CL) and left (L), as left
forward route channels, and common handling of
two channels, namely right center (CR) and right
(R), as right forward route channels.

In the present embodiment, suitable selection
of channels for common handling may be achieved
by preferential processing in which the highest
priority is put on the technique of Fig.13A, capable
of common handling of the largest number of channels,
and the lowest priority is put on the technique
of Fig.13F.
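
The preferential processing can be pictured as a walk down a
priority list. In the sketch below only the two ends of the
list (Fig.13A highest, Fig.13F lowest) come from the text; the
ordering of B to E in between, the notion of a set of
"feasible" techniques produced by the analysis, and the
function name are assumptions made for illustration.

```python
from typing import Iterable

# Assumed ordering: A first and F last as stated, B to E in figure order.
PRIORITY = ["A", "B", "C", "D", "E", "F"]

def select_technique(feasible: Iterable[str]) -> str:
    """Pick the highest-priority technique the analysis allows.

    Returns "NONE" when no technique of common handling is usable,
    i.e. every channel is encoded on its own.
    """
    allowed = set(feasible)
    for technique in PRIORITY:
        if technique in allowed:
            return technique
    return "NONE"

# Usage: if only the groupings of Fig.13C and Fig.13F are usable,
# the grouping of Fig.13C is chosen.
chosen = select_technique({"C", "F"})
```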

By exploiting the combinations of common
handling of channels as described above, it becomes
possible to reduce the alien feeling invoked
due to common handling in the case of seven
channels.

In Fig.3, a sub-woofer (SW) channel may further
be annexed to provide an eight-channel system.
However, the sub-woofer (SW) channel is targeted
for low frequency reproduction and hence is
not suited for common handling. Thus the channel
may be annexed to the system without taking part
in the common channel handling.

The illustrative construction and operation of 
the encoding unit 105 will be explained by referring 
to Figs.14 to 17. Fig. 14 shows a construction of an 
encoding unit 105 for one channel. 

In Fig.14, audio data corresponding to the original
data from which common-handled portions
have been taken out by the common-handling data
take-out unit 103, that is sampled and quantized
audio data, are supplied to an input terminal 24.
The signals led to the input terminal 24 are split by
a spectrum splitting filter 401 into time-domain
signal components in three frequency bands,
namely a low frequency band of 0 to 5.5 kHz, a
mid frequency band of 5.5 kHz to 11 kHz, and a
high frequency band of not lower than 11 kHz, that
is 11 kHz to 22 kHz.

Of the signal components of these three frequency
bands from the spectrum-splitting filter
401, those of the low frequency band, mid-frequency
band and the high-frequency band are sent
to MDCT circuits 402L, 402M and 402H, respectively,
so as to be resolved into frequency-domain
signal components. The time block length for
MDCT may be varied from one frequency band to
another, such that, in the signal portion where
signal components are changed steeply, the time
block length is reduced to raise time resolution,
whereas, in the stationary signal portion, the time
block length is increased for effective transmission
of signal components and for controlling the
quantization noise.

The time block length is determined by a block 
size evaluator 403. That is, the signal components 
of the three frequency bands from the spectrum- 
splitting filter 401 are also sent to the block size 
evaluator 403 which then determines the time block 
length for MDCT and transmits the information 
specifying the thus set time block length to the 
MDCT circuits 402L, 402M and 402H. 

Of two time block lengths for varying the time
block lengths for MDCT, the longer time block
length is termed a long mode and corresponds to
the time duration of 11.6 msec. The short block
length is termed a short mode and raises the time
resolution up to 1.45 ms and to 2.9 ms for the low
range of up to 5.5 kHz and for the mid range of
from 5.5 to 11 kHz, respectively.
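
The quoted durations follow from the sound frame size stated
earlier (512 samples at a sampling frequency of 44.1 kHz). The
short check below reproduces the arithmetic; the sample counts
it derives for the short modes are computed values expressed
in full-rate 44.1 kHz sample periods, not figures given in the
text, and the helper names are hypothetical.

```python
SAMPLE_RATE_HZ = 44_100       # sampling frequency stated for the sound frame
SOUND_FRAME_SAMPLES = 512     # sound frame length stated in the text

def duration_ms(samples: int, rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Duration of `samples` sample periods in milliseconds."""
    return 1000.0 * samples / rate_hz

def samples_for(duration_in_ms: float, rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Nearest whole number of sample periods for a given duration."""
    return round(duration_in_ms * rate_hz / 1000.0)

long_mode_ms = duration_ms(SOUND_FRAME_SAMPLES)   # about 11.6 ms
short_a = samples_for(1.45)                       # sample periods in 1.45 ms
short_b = samples_for(2.9)                        # sample periods in 2.9 ms
```

Under these assumptions the two short-mode durations are one
eighth and one quarter of the long mode, respectively.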

The audio signals thus resolved into signal
components on two-dimensional time-frequency
areas, termed block floating units, are divided by
normalization circuits 404L, 404M and 404H into a
sum total of 52 block floating units in the low
range, mid range and in the high range, while
being normalized from one block floating unit to
another by way of setting scale factors.

The bit allocator 405 analyzes, by exploiting
the psychoacoustic characteristics of the human
auditory system, of which components the audio
signals are constituted. The results of analyses are
sent to a re-quantizer 406 also fed with unit-based
signals from the normalization circuits 404L to
404H.

The re-quantizer 406 finds, based upon the
results of analyses, the quantization steps for
re-quantization of the respective units, and formulates
corresponding parameters, that is decides the word
lengths, while carrying out the re-quantization.

Finally, a formatter 407 assembles unit-based
parameter information data and re-quantized
frequency-domain signal components into a bitstream
for one channel sent to the multiplexor 106 of Fig.1
in accordance with a pre-set format. An output of
the formatter 407 is issued as the bitstream at an
output terminal 25.

The bitstream is recorded on a recording me- 
dium, such as an optical disc or a motion picture 
film, by a recorder configured for effecting error 
correction or modulation. 

The above-described encoding operation is
carried out in terms of a sound frame as a unit.

The bit allocator 405 is configured as shown 
specifically in Fig. 15. 

Referring to Fig.15, the frequency-domain signal
components, hereinafter called data, are
sent to an input terminal 521 from the MDCT
circuits 402L, 402M and 402H.

The frequency-domain spectral data is transmitted
to a band-based energy calculating circuit
522 in which the energies of the critical bands are
found by calculating the sum total of the squared
amplitudes of the spectral components in the
respective bands. The amplitude peak values or
mean values may also be employed in place of
signal energy in the respective bands. Each spectral
component indicating the sum value for each of
the respective bands is indicated as Bark spectrum
SB in Fig.16 as an output of the energy calculating
circuit 522. In Fig.16, 12 bands B1 to B12 are
shown as indicating the critical bands for simplifying
the drawing.
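
The band-based energy calculation amounts to summing squared
amplitudes over the spectral components belonging to each
band. A minimal sketch follows; the band boundaries are passed
in as a hypothetical list of indices, since the actual critical
band edges are not given in this passage, and the function
name is likewise an assumption.

```python
from typing import List, Sequence

def band_energies(spectrum: Sequence[float],
                  band_edges: Sequence[int]) -> List[float]:
    """Sum of squared amplitudes in each band.

    `band_edges` holds hypothetical index boundaries: band k covers
    spectrum[band_edges[k]:band_edges[k + 1]].
    """
    return [
        sum(x * x for x in spectrum[lo:hi])
        for lo, hi in zip(band_edges[:-1], band_edges[1:])
    ]

# Usage: four spectral components split into two bands of two.
sb = band_energies([1.0, 2.0, 3.0, 4.0], [0, 2, 4])
```

Replacing the sum of squares by a peak or mean of the
amplitudes gives the alternatives mentioned in the text.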

It is noted that an operation of multiplying each
spectral component SB by a pre-set weighting
function for taking into account the effects of
masking is performed by way of convolution. To this
end, an output of the band-based energy calculating
circuit 522, that is each value of the Bark
spectral component SB, is transmitted to a
convolution filter circuit 523. The convolution filter
circuit 523 is made up of a plurality of delay elements
for sequentially delaying input data, a plurality of
multipliers, such as 25 multipliers associated with
the respective bands, for multiplying outputs of the
delay elements with filter coefficients or weighting
functions, and an adder for finding the sum of the
outputs of the respective multipliers.

Masking means the phenomenon in which
certain signals are masked by other signals and
become inaudible due to psychoacoustic characteristics
of the human hearing sense. The masking
effect may be classified into the time-domain
masking effect produced by the time-domain audio
signals and the concurrent masking effect produced by
the frequency-domain signals. By this masking,
any noise present in a masked portion becomes
inaudible. In actual audio signals, the noise within
the masked range is an allowable noise.

By way of a concrete example of multiplication
coefficients or filter coefficients of the respective
filters of the convolution filter circuit 523, if the
coefficient of a multiplier M for an arbitrary band is
1, outputs of the delay elements are multiplied by
coefficients 0.15, 0.0019, 0.0000086, 0.4, 0.06 and
0.007 at the multipliers M-1, M-2, M-3, M+1, M+2
and M+3, M being an arbitrary integer of from 1 to
25, by way of performing convolution of the Bark
spectral components SB.
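
A software analogue of this convolution is sketched below
using the coefficients quoted above. The mapping of the
multipliers M-3 to M+3 onto band offsets, and in particular the
direction in which each coefficient spreads energy, is one
possible reading of the text and is an assumption of this
example, as are the function and table names. Bands falling
outside the spectrum are simply ignored.

```python
from typing import Dict, List, Sequence

# Coefficient for the multiplier at offset k from multiplier M.
SPREADING: Dict[int, float] = {
    -3: 0.0000086, -2: 0.0019, -1: 0.15,
    0: 1.0,
    1: 0.4, 2: 0.06, 3: 0.007,
}

def convolve_bark(sb: Sequence[float]) -> List[float]:
    """Weighted sum of neighbouring Bark spectrum values per band.

    Assumed reading: band i receives SPREADING[k] times the energy
    of band i - k, so a band spreads with weight 0.4 onto the next
    higher band and with weight 0.15 onto the next lower band.
    """
    out: List[float] = []
    for i in range(len(sb)):
        acc = 0.0
        for k, weight in SPREADING.items():
            j = i - k
            if 0 <= j < len(sb):
                acc += weight * sb[j]
        out.append(acc)
    return out

# Usage: unit energy in the middle band of seven.
spread = convolve_bark([0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0])
```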

An output of the convolution filter circuit 523 is
transmitted to a subtractor 524 which is employed
for finding a level α corresponding to the allowable
noise level in the convolved region. The level α is
such a level as will give the allowable noise level
for each of the critical bands after deconvolution,
as will be described subsequently. The subtractor 524
is supplied with an allowance function (a function
representative of the masking level) for finding the
level α. The level α is controlled by increasing or
decreasing the allowance function. The allowance
function is supplied from an (n - ai) function
generator 525, as will be explained subsequently.

That is, the level α corresponding to the allowable
noise level is found from the equation:

α = S - (n - ai)    (1)

where i is the number accorded sequentially to the
critical bands beginning from the lower side, n and
a are constants, with a > 0, and S is the intensity of
the convolved Bark spectrum. In the equation (1),
(n - ai) represents the allowance function. The
values n and a may be set so that n = 38 and a = 0.5.

The level α is found in this manner and transmitted
to a divider 526 for deconvolving the level α
in the convolved region. By this deconvolution, the
masking threshold is found from the level α. This
masking threshold becomes the allowable noise
level. Although the deconvolution necessitates
complex arithmetic-logical steps, it is performed in
the present embodiment in a simplified manner by
using the divider 526.
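
The subtraction of the allowance function (n - ai) from the
convolved Bark spectrum S can be written out directly. The
sketch below evaluates the level corresponding to the allowable
noise level for each band with n = 38 and a = 0.5 as suggested
above. Numbering the bands from i = 1, the variable name
`alpha` for the resulting level, and the function name are
assumptions of this example; the deconvolution by the divider
526 is not modelled.

```python
from typing import List, Sequence

def allowable_level(convolved: Sequence[float],
                    n: float = 38.0,
                    a: float = 0.5) -> List[float]:
    """Level S - (n - a*i) for each critical band.

    `convolved` holds the convolved Bark spectrum S, lowest band
    first. The band number i is assumed to start at 1.
    """
    return [s - (n - a * i) for i, s in enumerate(convolved, start=1)]

# Usage: a flat convolved spectrum of 60 in two bands. Because the
# allowance function shrinks as i grows, the level rises with
# the band number.
alpha = allowable_level([60.0, 60.0])
```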

The masking threshold signal is transmitted via
a synthesizing circuit 527 to a subtractor 528 which
is supplied via a delay circuit 529 with an output of
the band-based energy detection circuit 522, that is
the above-mentioned Bark spectral components
SB. The subtractor 528 subtracts the masking
threshold signal from the Bark spectral components
SB so that the portions of the spectral components
SB lower than the level of the masking threshold
MS are masked. The delay circuit 529 is provided
for delaying the signals of the Bark spectral
components SB from the energy detection circuit 522
in consideration of delay produced in circuitry
upstream of the synthesis circuit 527.

An output of the subtractor 528 is outputted via 
an allowable noise correction circuit 530 at an 
output terminal 531 so as to be transmitted to a 
ROM, not shown, in which the information concern- 
ing the number of the allocated bits is stored 
previously. The ROM outputs the information con- 
cerning the number of allocated bits for each band, 
depending on an output of the subtraction circuit 
528 supplied via an allowable noise correction cir- 
cuit 530. 

The information concerning the number of the
allocated bits thus found is transmitted to a
re-quantizer 406 of Fig.14 to permit the
frequency-domain data from the MDCT circuits 402L to 402H
to be quantized in the re-quantizer 406 with the
numbers of bits allocated to the respective bands.

In sum, the re-quantizer 406 quantizes the 
band-based data with the number of bits allocated 
in dependence upon the difference between the 
energy or peak values of the critical bands or sub- 
bands further divided from the critical bands for a 
higher frequency and an output of the above-men- 
tioned level setting means. 

The synthesizing circuit 527 may also be de- 
signed to synthesize the masking threshold MS 
and data from the minimum audibility curve RC 
from the minimum audibility curve generating cir- 
cuit 532 representing psychoacoustic characteris- 
tics of the human hearing sense as shown in 
Fig.17. If the absolute noise level is lower than the
minimum audibility curve RC, the noise becomes
inaudible.

The minimum audibility curve differs with the
difference in the playback sound level even
though the encoding is made in the same manner.
However, since there is no marked difference in the 
manner of the music entering the 16-bit dynamic 
range in actual digital systems, it may be pre- 
sumed that, if the quantization noise of the fre- 
quency range in the vicinity of 4 kHz most percep- 
tible to the ear is not heard, the quantization noise 
lower than the level of the minimum audibility curve 
is not heard in any other frequency range. 

Thus, if the recording/reproducing device is 
employed so that the noise in the vicinity of 4 kHz 
is not heard, and the allowable noise level is to be 
obtained by synthesizing the minimum audibility 
curve RC and the masking threshold MS, the al- 
lowable noise level may be up to the level in- 
dicated by hatched lines in Fig.17. In the present 
embodiment, the level of 4 kHz of the minimum 
audibility curve is matched to the minimum level 
corresponding to e.g., 20 bits. In Fig.17, the signal 
spectrum SS is also shown. 

Besides, the allowable noise correction circuit 
530 corrects the allowable noise level in the output 
of the subtractor 528 based on the information of 
the equal-loudness curve transmitted from a cor- 
rection information outputting circuit 533. The 
equal-loudness curve is a characteristic curve con- 
cerning psychoacoustic characteristics of human 
hearing sense, and is obtained by finding the 
sound pressures of the sound at the respective 
frequencies heard with the same loudness as the 
pure tone of 1 kHz and by connecting the sound 
pressures by a curve. It is also known as an equal 
loudness sensitivity curve. The equal-loudness 
curve also delineates a curve which is substantially 
the same as the minimum audibility curve shown in 
Fig.17. 

With the equal-loudness curve, the sound in 
the vicinity of 4 kHz is heard with the same loudness
as the sound of 1 kHz, even though the
sound pressure is decreased by 8 to 10 dB from 
the sound of 1 kHz. Conversely, the sound in the 
vicinity of 10 kHz cannot be heard with the same 
loudness as the sound of 1 kHz unless the sound 
pressure is higher by about 15 dB than that of the 
sound of 1 kHz. Thus it may be seen that, in the 
allowable noise correction circuit 530, the allowable 
noise level preferably has frequency characteristics 
represented by a curve conforming to the equal- 
loudness curve. Thus it may be seen that correc- 
tion of the allowable noise level in consideration of 
the equal-loudness curve is in conformity to 
psychoacoustic characteristics of the human hear- 
ing sense. 
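
The correction can be pictured with a short Python sketch. It shifts an allowable noise level by an offset relative to 1 kHz, using only the two figures quoted above: roughly -8 to -10 dB near 4 kHz (the midpoint -9 dB is used here) and roughly +15 dB near 10 kHz. The interpolation on a logarithmic frequency axis, the clamping outside 1 to 10 kHz, and all names are assumptions made for illustration; they are not the behaviour of the correction circuit 530.

```python
import math

# Offsets in dB relative to 1 kHz, taken from the figures quoted in the text.
# -9.0 is the assumed midpoint of the "8 to 10 dB" range.
EQUAL_LOUDNESS_POINTS = [(1000.0, 0.0), (4000.0, -9.0), (10000.0, 15.0)]


def equal_loudness_offset_db(freq_hz):
    """Interpolate the offset linearly on a log-frequency axis (assumption)."""
    pts = EQUAL_LOUDNESS_POINTS
    if freq_hz <= pts[0][0]:
        return pts[0][1]   # simplification: no correction at or below 1 kHz
    if freq_hz >= pts[-1][0]:
        return pts[-1][1]  # simplification: hold the 10 kHz value above 10 kHz
    for (f0, d0), (f1, d1) in zip(pts, pts[1:]):
        if f0 <= freq_hz <= f1:
            t = (math.log(freq_hz) - math.log(f0)) / (math.log(f1) - math.log(f0))
            return d0 + t * (d1 - d0)


def correct_allowable_noise(freq_hz, allowable_db):
    """Shift the allowable noise level so that it follows the curve."""
    return allowable_db + equal_loudness_offset_db(freq_hz)
```

With these assumed points, less noise is tolerated near 4 kHz, where hearing is most sensitive, and more is tolerated near 10 kHz.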

Fig.18 shows an illustrative configuration of the 
decoding unit 133 of Fig.6 corresponding to the 
encoding unit 105 of Fig.1. 



The decoding unit of Fig.18 decodes encoded 
signals for one of plural channels read by 
reproducing means, such as a magnetic head or an 
optical head, from a recording medium, such as an 
optical disc or a motion picture film as later 
explained. 

In Fig.18, encoded data from the demultiplexer 
132 of Fig.5 is fed to a terminal 26 and thence fed 
to a deformatter 411. The deformatter performs an 
operation which is the reverse of that performed by 
the formatter 407, in order to produce the unit-based 
parameter information and the re-quantized 
frequency-domain signal components, that is, 
quantized MDCT coefficients. 

The unit-based quantized MDCT coefficients 
from the deformatter 411 are sent to a decoding 
circuit for the low frequency range 412L, a 
decoding circuit for the mid frequency range 412M 
and a decoding circuit for the high frequency range 
412H. These decoding circuits 412L to 412H are 
also fed with the parameter information from the 
deformatter 411. Using the parameter information, 
the decoding circuits 412L to 412H perform 
decoding and cancellation of bit allocation. 

Outputs of these decoding circuits 412L to 
412H are sent to associated IMDCT circuits 413L 
to 413H. The IMDCT circuits 413L to 413H are also 
fed with the parameter information and transform 
the frequency-domain signal components into 
time-domain signal components. These partial-range 
time-domain signal components are decoded by a 
band-synthesis circuit 414 into full-range signals. 
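
The data flow through the decoding unit can be summarized by a small Python sketch. It only mirrors the order of the stages named above (deformatter, per-band decoding, per-band IMDCT, band synthesis); every function passed in is a hypothetical placeholder, and the toy stand-ins below are not implementations of the actual circuits.

```python
def decode_channel(encoded, deformat, band_decoders, imdcts, synthesize):
    """Run one channel through the stages described for Fig.18.

    `deformat`, `band_decoders`, `imdcts` and `synthesize` are caller-supplied
    placeholders (assumptions), keyed by band where applicable.
    """
    # Deformatter 411: recover parameter information and per-band coefficients.
    params, coeffs_by_band = deformat(encoded)
    time_bands = []
    for band in ("low", "mid", "high"):  # circuits 412L/M/H, then 413L/M/H
        spectrum = band_decoders[band](coeffs_by_band[band], params)
        time_bands.append(imdcts[band](spectrum, params))
    # Band-synthesis circuit 414: merge the partial-range signals.
    return synthesize(time_bands)


# Toy stand-ins so the data flow can be exercised end to end.
def toy_deformat(encoded):
    return {"scale": 2}, {"low": [1], "mid": [2], "high": [3]}


def toy_decode(coeffs, params):
    return [c * params["scale"] for c in coeffs]


def toy_imdct(spectrum, params):
    return list(spectrum)  # identity; a real inverse MDCT would go here


def toy_synthesize(bands):
    return [x for band in bands for x in band]


BANDS = ("low", "mid", "high")
full_range = decode_channel(
    b"",
    toy_deformat,
    {b: toy_decode for b in BANDS},
    {b: toy_imdct for b in BANDS},
    toy_synthesize,
)
```

Passing the stages in as arguments keeps the sketch independent of any particular quantizer or transform length.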

An instance of recording of data encoded by 
the encoding method and apparatus of the present 
embodiment on a motion picture film as an example 
of the recording medium is explained by referring 
to Fig.19. 

That is, the encoded data is recorded on a 
motion picture film 1 shown in Fig.19. The recording 
positions of the encoded data on the motion 
picture film 1 may be exemplified by recording 
regions 4 between perforations 3 of the motion 
picture film 1, as shown in Fig.19a, recording 
regions 4 between the perforations 3 on the same 
side of the film 1 as shown in Fig.19b, longitudinal 
recording regions 5 between the perforations 3 and 
the longitudinal edge of the film 1, as shown in 
Fig.19c, and by both the longitudinal recording 
regions 5 between the perforations 3 and the 
longitudinal edge of the film 1 and recording 
regions 4 between perforations 3, as shown in Fig.19d. 

By referring to Fig.20, an instance of recording 
of data encoded by the encoding method and 
apparatus of the present embodiment on an optical 
film as an example of the recording medium is 
explained. 

Fig.20 shows an example of header data of 
each channel, as a part of the encoded bitstream 
employed in practicing the present invention. 

The header data is made up of several flags, 
the state of 1/0 of which specifies various 
conditions concerning the next following bitstream. 
Only part of the bitstream is disclosed herein and 
description of the conditions not having direct 
pertinence to the present invention is omitted. 

The common channel handling mode is specified 
by a flag cplcpf. [su] and [ch] indicate the 
sound frame number and the channel number, 
respectively. The flag cplcpf is a 4-bit code and can 
be defined for up to a maximum of four items. If 
there is no "1" in any bit, that is, if the flag is 
defined as "0000", it specifies that there is no data 
handled in common in the bitstream of the channel. 

If the mode of handling all channels in common 
is selected, the flags cplcpf of all channels are set 
to "1000" and the all channel common handle data 
is entered in the first acbs (data handled in 
common). 

If the left route channel common handle mode 
or the right route channel common handle mode is 
selected, the flag cplcpf of each channel selected 
for common handling of the left route channels is 
set to "1000", while the flag cplcpf of each channel 
selected for common handling of the right route 
channels is set to "0100". The left route channel 
common handling data is entered in the first acbs, 
while the right route channel common handling 
data is entered in the second acbs. 

That is, which acbs is to be used for each 
channel may be selected by the bits of the flag 
cplcpf. 
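
The mapping from the 4-bit flag to the acbs blocks can be sketched in Python as follows. The text confirms only that "1000" selects the first acbs, "0100" the second, and "0000" none; extending the same left-to-right rule to the third and fourth bits, and allowing more than one bit to be set, are assumptions. The function name is hypothetical.

```python
def selected_acbs(flag_bits):
    """Return the zero-based indices of the acbs blocks a channel uses.

    `flag_bits` is the 4-bit flag written as a string such as "1000".
    Assumption: the leftmost bit refers to the first acbs, the next bit to
    the second, and so on.
    """
    if len(flag_bits) != 4 or any(bit not in "01" for bit in flag_bits):
        raise ValueError("flag must be a string of four 0/1 characters")
    return [index for index, bit in enumerate(flag_bits) if bit == "1"]


# Left route channels share the first acbs, right route channels the second.
left_route = selected_acbs("1000")
right_route = selected_acbs("0100")
independent = selected_acbs("0000")
```

A channel whose flag is "0000" yields an empty list, that is, it carries no data handled in common.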

Thus the combinations can be varied by using 
the above-described encoded bitstream and the 
header data. 

The configuration of the encoded bitstream is 
shown schematically in Fig.21, in which reference 
number 150 denotes the header of the entire 
bitstream, reference numbers 151 to 155 denote the 
data regions of each channel and reference 
numbers 156 to 159 denote the common-handling data 
regions of four channels. 
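
For orientation, the top-level layout just described can be modelled as a small Python data structure. The class and field names are hypothetical, the regions are held as opaque byte strings, and treating four as an upper bound on the number of common-handling regions (rather than a fixed count) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EncodedBitstream:
    """Top-level regions of the bitstream of Fig.21 (names are illustrative)."""

    header: bytes                 # region 150: header of the entire bitstream
    channel_regions: List[bytes]  # regions 151 to 155: one per channel
    common_regions: List[bytes] = field(default_factory=list)  # regions 156 to 159

    def __post_init__(self):
        if len(self.channel_regions) != 5:
            raise ValueError("expected five channel data regions")
        if len(self.common_regions) > 4:  # assumption: at most four
            raise ValueError("expected at most four common-handling regions")


stream = EncodedBitstream(
    header=b"H",
    channel_regions=[b"C", b"L", b"R", b"SL", b"SR"],
    common_regions=[b"left", b"right"],
)
```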

The data regions 151 to 155 of each channel 
contain a common-handling flag (cpl use flag) 160, 
common-handling parameters (CPL parameter) 161 
and data (real data) 162. The common-handling 
flag (cpl use flag) 160 is made up of 4 bits (cpl1-4 
use bit) 170 to 173 as explained as cplchf in 

Fig.22 shows a modification of the common 
handling analyzer 102. 

In this figure, audio data of respective 
channels, that is the center (C), left (L), right (R), 
left surround (SL) and right surround (SR) channels, 
fed via an input terminal 101, are fed to orthogonal 
transform units 201a to 201e where they are 
transformed into frequency-domain signal components 
which are outputted. 

The frequency characteristics evaluators 202a 
to 202e find, based upon frequency-domain signal 
component data for respective channels from the 
orthogonal transform circuits 201a to 201e, the 
parameters of the psychoacoustic characteristics of 
the human hearing sense, such as the minimum 
audibility curve or masking threshold, and output the 
results along with frequency-domain signal 
component data. 

A common-handling processing selector 203 
selects, based upon the data on common handling, 
as obtained by evaluation by the frequency 
characteristics evaluators 202a to 202e, and the target 
bit rate for encoding, such a frequency range for 
which the absolute level of the quantization noise 
generated by common handling becomes lower 
than the minimum audibility curve. This renders the 
quantization noise resulting from common handling 
inaudible. The results of selection are 
outputted at an output terminal 204 and thence 
supplied to the common handling data extractors 
103a to 103e and the common handling data 
formulator 104. The data on common handling, 
outputted at the output terminal 124, is outputted in 
terms of a pre-set frame, such as a sound frame, 
as a unit. 
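
The selection criterion can be illustrated with a minimal Python sketch: a band qualifies for common handling when the quantization noise that common handling would generate lies below the minimum audibility curve in that band. How the noise level is estimated, and how the target bit rate enters the decision, are not modelled here; the function name and the sample values are hypothetical.

```python
def select_common_bands(noise_db, min_audibility_db):
    """Return indices of bands where common-handling noise stays inaudible.

    Both arguments are per-band absolute levels in dB for one sound frame.
    A band is selected only if its noise is strictly below the curve.
    """
    if len(noise_db) != len(min_audibility_db):
        raise ValueError("both inputs must cover the same bands")
    return [
        band
        for band, (noise, threshold) in enumerate(zip(noise_db, min_audibility_db))
        if noise < threshold
    ]


# Hypothetical levels for three bands of one sound frame.
selected = select_common_bands([10.0, 25.0, 5.0], [15.0, 20.0, 5.0])
```

Calling the function once per sound frame corresponds to outputting the selection in terms of a pre-set frame as a unit.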

With the above-described encoder of the 
present embodiment, alien hearing feeling other- 
wise invoked by common handling may be re- 
duced by selecting the common handling frequen- 
cy exploiting data characteristics. 

The common handling technique carried out by 
the common handling processing selector 203 may 
be changed between sound frames as pre-set 
frames. Thus, by selecting the optimum common- 
handling frequency range from one sound frame to 
another, it becomes possible to suppress changes 
in the sound field otherwise produced by common 
handling. 

It is also possible to make plural selections of 
common handling processing in one sound frame, as 
shown in Fig.22. For example, if the frequency 
range for which the above-mentioned absolute level 
becomes lower than the minimum audibility curve 
is enhanced by independently encoding one or 
plural channels, without encoding in common, in a 
particular frequency range, common handling may 
be rendered more effective by effecting common 
handling processing in which the combinations of 
the channels handled in common in the pre-set 
frequency range are changed. 

Alternatively, it is possible to use plural 
combinations of common handling simultaneously in a 
particular frequency range. For example, in the case 
of data in which it is more effective to separately 
handle the forward route channels and the 
backward route channels (surround channels), plural 
common channels having the same or different 
common-handled frequency ranges may be 
formulated for effecting common handling suited to data 
characteristics. 

The above-described encoding and decoding 
methods of the present invention may be applied 
not only to the ATRAC system described in the 
embodiments but also to any other encoding sys- 
tem, above all, the encoding system in which time- 
domain signals are transformed by orthogonal 
transform into the frequency-domain signals. 

Claims 

1. An encoding method for encoding digital sig- 
nals of plural channels (101a, ... 101e) and 
outputting (107) the encoded digital signals 
and the parameter information for encoding, 
comprising the steps of 

handling the digital signals of at least a 
part of the channels in common to form a 
common digital signal, 

altering (103) the combinations of channels 
handled in common depending upon frequen- 
cy characteristics of the digital signals or the 
targeted playback environment, 

outputting (104) the parameter information 
specifying the combinations of the channels 
handled in common, 

encoding (105) and outputting the common 
digital signal, and 

multiplexing (106) the parameter informa- 
tion and the encoded output. 

2. The encoding method as claimed in claim 1, 
wherein the step of encoding (105) the com- 
mon-handling signal includes a sub-step of 
adaptively altering the processing depending 
upon the contents of the digital signals or the 
advisable playback environment. 

3. The encoding method as claimed in claim 2, 
wherein the operation of the sub-step of adap- 
tively altering the processing is performed in 
terms of a pre-set time frame as a unit. 

4. The encoding method as claimed in claim 1, 
wherein the operation of altering the combina- 
tions of channels handled in common is per- 
formed in terms of a pre-set time frame as a 
unit. 

5. The encoding method as claimed in claim 4, 
wherein a plurality of the combinations of the 
channels handled in common are used in one 
frame. 



6. The encoding method as claimed in any one of 
the preceding claims, wherein the common 
digital signal is a digital signal of one channel 
split and arrayed in at least two channels. 


7. The encoding method as claimed in any one of 
claims 2 to 6, wherein, for at least one of the 
digital signals of plural channels (101a, ... 
101e), the information for regenerating the pre- 
common-handling signal is found and the in- 
formation is included in the information con- 
cerning common handling. 

8. An encoding apparatus for encoding digital sig- 
nals of plural channels (101a, ... 101e) and 
outputting the encoded digital signals and the 
parameter information for encoding, compris- 
ing 

means for handling the digital signals of at 
least a part of the channels in common to form 
a common digital signal, 

means (103) for altering the combinations 
of channels handled in common depending 
upon frequency characteristics of the digital 
signals or the targeted playback environment, 

means (104) for outputting the parameter 
information specifying the combinations of the 
channels handled in common, 

means (105) for encoding and outputting 
the common digital signal, and 

means for multiplexing (106) the parameter 
information and the encoded output. 

9. The encoding apparatus as claimed in claim 8, 
wherein said encoding means (105) alter the 
processing depending upon the contents of the 
digital signals or the advisable playback envi- 
ronment. 

10. The encoding apparatus as claimed in claim 8, 
wherein said encoding means (105) controls 
the alteration of the combinations of the chan- 
nels handled in common from one pre-set 
frame to another. 

11. The encoding apparatus as claimed in claim 8, 
wherein said encoding means (105) controls 
the alteration of the combinations of the pro- 
cessing performed on signals to be handled in 
common from one pre-set frame to another. 

12. The encoding apparatus as claimed in claim 8, 
wherein said encoding means (105) split and 
array the digital signals of an arbitrary channel 
among plural channels to be handled in com- 
mon. 

13. The encoding apparatus as claimed in claim 8, 
wherein said encoding means (105) employ 
different sorts of combinations of channels to 
be handled in common in one pre-set frame. 

14. The encoding apparatus as claimed in claim 8, 
wherein said encoding means analyzes the 
information for regenerating the pre-common- 
handling signals for part or all of the digital 
signals of plural channels to be handled in 
common and accommodates said information 
in the information on common handling. 

15. A decoding apparatus for decoding encoded 
digital signals using parameters for encoding, 
said encoded digital signals being such signals 
in which part or all of digital signals of plural 
channels are handled as one or more common 
signals, with the combinations of channels for 
common handling being altered in dependence 
upon frequency characteristics of the digital 
signals and the targeted playback environment, 
comprising 

decoding means (133) for decoding the 
common signals, 

distributing means (134) for distributing the 
decoded common signals in dependence upon 
the combinations of common handling, and 

decoding means for restoring the decoded 
common signals of plural channels based upon 
the signals distributed and handled in com- 
mon. 



16. The decoding apparatus as claimed in claim 
15, wherein said decoding means (133) alter 
the processing depending upon the contents of 
the digital signals or the advisable playback 
environment. 

17. The decoding apparatus as claimed in claim 
15, wherein said decoding means (133) de- 
code encoded signals in which plural sorts of 
combinations of the channels handled in com- 
mon are used a plurality of times in 
the same pre-set frame. 

18. The decoding apparatus as claimed in claim 
15, wherein said decoding means (133) de- 
code signals of an arbitrary channel split and 
arrayed in plural common signals and en- 
coded. 

19. The decoding apparatus as claimed in claim 
17, wherein said decoding means (133) adjusts 
the signals of respective channels using the 
information for regenerating pre-common-han- 
dling signals contained in the information on 
common handling. 

20. A recording medium (109) having recorded 
thereon such a signal in which part or all of 
digital signals of plural channels are handled 
as one or more encoded common signals, with 
the combinations of channels for common han- 
dling being altered in dependence upon fre- 
quency characteristics of the digital signals 
and the targeted playback environment, the 
parameter information specifying the combina- 
tions of channels to be handled in common, 
and an encoded signal other than the common 
signals and the parameter information for en- 
coding are recorded along with the parameter 
information concerning the encoding. 

21. The recording medium as claimed in claim 20, 
which is an optical disc (109) or a motion 
picture film (109). 
audio_block (su) 
{ 
    for (ch=0; ch<NChannel(acmod[su]); ch++) 
    { 
        sflen[su][ch]    LengthNum(frmsizecod[su]) bits    uimsf 
        cplin[su][ch]    1 bit    bslbf 
        if (cplin[su][ch] == 1) 
            cpl_data (su, ch) 
        abs (su, ch) 
        for (cu=0; cu < [bound illegible in source; it involves NChannel(acmod[su]) and cplchf[su][ch]]; cu++) 
        { 
            [field name illegible] (frmsizecod[su]) bits    uimsf 
            cpled[su][ch]    [width illegible] bits    uimsf 
            acbs (su, ch) 
        } 
        dummy = 0 
    } 
    if (lfeon[su] == 1 && 0<=acmod[su]<=23) 
    { 
        sflenlfa[su]    [width illegible] bits    uimsf 
        abslfe (su, ch) 
    } 
} 

cpl_data (su, ch) 
{ 
    cplchf[su][ch]    4 bits    bslbf 
    if (acmod[su] == 1 || 28<=acmod[su]<=31) 
    { 
        phsflge[su][ch]    1 bit    bslbf 
        if (phsflge[su] == 1) 
        { 
            for (bd=cplcost[su][ch]; bd<=cplcoed[su][ch]; bd++) 
                phsflg[su][ch][bd]    1 bit    bslbf 
        } 
    } 
    cplco (su, ch) 
    dummy = 0 
} 